1. Introduction


Deep learning is a subset of machine learning, within the domain of artificial intelligence, that is capable of learning from unlabelled data. It is also known as deep neural learning or deep neural networks. Deep learning makes use of both structured and unstructured data and is used in virtual assistants, driverless cars, and face recognition. Social distancing is one of the most important rules for tackling the global COVID-19 pandemic, and technology can make it easier to maintain and regulate social distancing. One of the main ways in which COVID-19 spreads is through contact, so if people avoid contact, the rise of COVID-19 can be cut down and lives can be saved. Being able to use technology to help is a boon. The objective of this paper is to present the use of real-time object detection in real-world scenarios. This paper presents an approach that uses YOLOv3 to detect objects in real time and calculates the distance between the detected bounding boxes to find violations of social distancing.

Social distancing is one of the most trustworthy techniques for preventing the spread of communicable disease. With this belief, after COVID-19 emerged in Wuhan, China, in December 2019 [1], it was adopted as an unprecedented measure on January 23, 2020. Within one month, the outbreak in China reached a peak in the first week of February with 2,000 to 4,000 new confirmed cases per day. Later, for the first time after this outbreak, there was a sign of relief with no new confirmed cases for five consecutive days up to 23 March 2020. This is evidence for the social distancing measures enacted initially in China and later adopted worldwide to control COVID-19. Prem et al. aimed to study the effects of social distancing measures on the spread of the COVID-19 epidemic. The authors used synthetic location-specific contact patterns to simulate the ongoing trajectory of the outbreak using susceptible-exposed-infected-removed (SEIR) models. They also suggested that premature and sudden lifting of social distancing could lead to an earlier secondary peak, which could be flattened by relaxing the interventions gradually. As we all understand, social distancing is an essential but economically painful measure for flattening the infection curve. Adolph et al. highlighted the case of the United States of America, where a lack of common consent among policymakers meant it could not be adopted at an early stage, resulting in ongoing harm to public health. Although social distancing has impacted economic productivity, many researchers are trying hard to overcome the loss. In this context, Kylie et al. studied the correlation between the strictness of social distancing and the economic status of the region.

The study indicated that intermediate levels of activity could be permitted while avoiding a massive outbreak. Since the novel coronavirus pandemic began, many countries have been using technology-based solutions in various capacities to contain the outbreak. Several countries, including India and the Republic of Korea, use GPS to track the movements of suspected or infected persons and monitor any possibility of their exposure among healthy people. In India, the government uses the Aarogya Setu app, which works with the help of GPS and Bluetooth to locate COVID-19 patients in the vicinity. It also helps others keep a safe distance from an infected person. On the other hand, some law enforcement departments have been using drones and other surveillance cameras to detect mass gatherings of people and take regulatory action to disperse the crowd. Such manual intervention in these critical situations may help flatten the curve, but it also brings a unique set of threats to the public and is challenging for the workforce. Human detection using visual surveillance systems is an established area of research that relies on manual methods of identifying unusual activities; however, it has limited capabilities. In this direction, recent advancements advocate the need for intelligent systems to detect and capture human activities. Human detection remains an ambitious goal because of a variety of constraints, such as low-resolution video, varying articulated pose, clothing, lighting and background complexities, and limited machine vision capabilities; prior knowledge of these challenges can improve detection performance. Detecting an object that is in motion involves two stages: object detection and object classification.

The first stage, object detection, can be achieved using background subtraction, optical flow and spatio-temporal filtering techniques. In the background subtraction method, the difference between the current frame and a background frame (the first frame) is computed at pixel or block level. Adaptive Gaussian mixture, temporal differencing, hierarchical background models, warping background and non-parametric background are the most popular approaches to background subtraction. In the optical flow-based object detection technique, flow vectors associated with the object's motion are characterised over a time span in order to identify regions in motion for a given sequence of images. Researchers have reported that optical flow-based techniques carry computational overheads and are sensitive to various motion-related outliers such as noise, colour and lighting. In another method of motion detection, Aslani et al. proposed a spatio-temporal filter-based approach in which the motion parameters are identified using three-dimensional (3D) spatio-temporal features of the person in motion in the image sequence. These methods are advantageous because of their simplicity and lower computational complexity, but they show limited performance because of noise and uncertainties in moving patterns. [2] Y. Sahraoui, C. A.
Kerrache, A. Korichi, B. Nour, A. Adnane and R. Hussain, in "DeepDist: A Deep-Learning-Based IoV Framework for Real-Time Objects and Distance Violation Detection," showed the use of Faster R-CNN to detect and measure social distancing. Object detection problems have been addressed efficiently by recently developed techniques. In the last decade, convolutional neural networks (CNN), region-based CNN and faster region-based CNN have used region proposal techniques to generate an objectness score before classification and then generate bounding boxes around the object of interest for visualisation and other statistical analysis. Although these methods are effective, they suffer from long training times. Whereas these CNN-based approaches rely on classification, YOLO takes a regression-based approach that spatially separates the bounding boxes and predicts their class probabilities. In this method, the framework divides the image into several portions, each representing bounding boxes along with class probability scores for that portion to be considered an object.

This approach offers excellent improvements in speed, while trading some efficiency for the speed gained. The detector module exhibits powerful generalization capabilities in representing an entire image. Based on the above concepts, many research findings have been reported in the last few years. Crowd counting has emerged as a promising area of research with many societal applications. Eshel et al. focused on crowd detection and person counting by proposing multiple height homographies for head-top detection, and solved the occlusion problem associated with video surveillance applications. Chen et al. developed an electronic advertising application based on the concept of crowd counting. In a similar application, Chih-Wen et al. proposed a vision-based people counting model. [3] F. A. A. Naqiyuddin, W. Mansor, N. M. Sallehuddin, M. N. S. Mohd Johari, M. A. S. Shazlan and A. N. Bakar, in "Wearable Social Distancing Detection System," made use of real-time object detection and the Internet of Things to detect social distancing violations. [4] A. H. Ahamad, N. Zaini and M. F. A. Latip, in "Person Detection for Social Distancing and Safety Violation Alert based on Segmented ROI," used SSD to measure and detect social distancing. [5] S. Gupta, R. Kapil, G. Kanahasabai, S. S. Joshi and A. S. Joshi, in "SD-Measure: A Social Distancing Detector," made use of the R-CNN algorithm for social distancing detection. A number of other algorithms have also been proposed [6-8].


2. Advantages of deep learning models 


Image classification involves assigning a class label to an image, whereas object localization involves drawing a bounding box around one or more objects in an image. Object detection is more challenging: it combines these two tasks, drawing a bounding box around every object of interest in the image and assigning each a class label. Together, all of these problems are referred to as object recognition.
1. Deep learning models can generate new features from the limited dataset on which they were initially trained.
2. As they continue to be trained, the models become flexible and adapt quickly to change.
3. Deep learning models can learn from unlabelled data, which greatly reduces the cost of labelling data compared with supervised learning.
4. Once trained correctly, deep learning models can perform repetitive tasks quickly and do not get tired.


3. Object recognition with deep learning 


What is object recognition? Object recognition refers to the computer vision tasks in which a model identifies objects in digital photographs. Image classification involves predicting the class of one object in an image. Object localization involves identifying the location of one or more objects in an image and making them stand out by drawing a box around each of them.
The three computer vision tasks:
1. Image classification 
2. Object Localization 
3. Object Detection 


Image classification predicts the class or type of the object in an image.
1. Input: an image containing a single object, such as a photograph of a person.
2. Output: a class label mapped to the image.
Object localization locates the objects in an image and marks each one with a bounding box.
1. Input: an image with one or more identifiable objects, such as a photograph.
2. Output: one or more bounding boxes.
Object detection locates the objects in an image with bounding boxes and identifies the type of each located object.
1. Input: an image that contains one or more identifiable objects.
2. Output: one or more bounding boxes and a label for each bounding box.


Before deep learning, object detection relied on methods such as the following:

1. HOG (Histogram of Oriented Gradients) feature extractor and SVM (Support Vector Machine) model: Before the age of deep learning, this was a state-of-the-art method for object detection. It takes histogram descriptors of both positive samples (images that contain objects) and negative samples (images that do not contain objects) and trains an SVM model on them.

2. Bag of features model: Just as a bag of words treats a document as an orderless collection of words, this approach represents an image as an orderless collection of image features. Examples of such features are SIFT, MSER, etc.

3. Viola-Jones algorithm: This algorithm is widely used for face detection in images or in real time. It performs Haar-like feature extraction from the image, which generates a large number of features. These features are then passed into a boosting classifier, which produces a cascade of boosted classifiers to perform detection. An image must pass every one of the classifiers to produce a positive (face found) result. The advantage of Viola-Jones is that it has a detection speed of 2 FPS, which can be used in a real-time face recognition system.
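The cascade idea in the Viola-Jones description can be illustrated with a toy sketch. This is not the OpenCV implementation: the two stage functions and their thresholds are hypothetical stand-ins for boosted classifiers over Haar-like features, and a window is simplified to a flat list of pixel intensities.

```python
from typing import Callable, Sequence

# A window is a flat list of pixel intensities (a simplification for this sketch).
Window = Sequence[float]
Stage = Callable[[Window], bool]


def passes_cascade(window: Window, stages: Sequence[Stage]) -> bool:
    """Return True only if every stage accepts the window.

    Evaluation stops at the first rejecting stage, so windows that
    clearly contain no face are discarded cheaply.
    """
    for stage in stages:
        if not stage(window):
            return False
    return True


# Hypothetical stages; real stages are boosted classifiers on Haar-like features.
def bright_enough(window: Window) -> bool:
    return sum(window) / len(window) > 50


def has_contrast(window: Window) -> bool:
    return max(window) - min(window) > 30


stages = [bright_enough, has_contrast]
print(passes_cascade([60, 100, 90, 70], stages))  # True
print(passes_cascade([10, 20, 15, 5], stages))    # False
```

The early return is the design point: most candidate windows fail an early, cheap stage and never reach the later ones.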


4. You Only Look Once


YOLO is one of the best known and most effective algorithms for real-time object detection.
Object detection algorithms can be broadly divided into two groups:
1. Algorithms based on regression: These algorithms predict the classes and bounding boxes for the whole image in a single pass. The two best known algorithms in this group are YOLO and SSD.
2. Algorithms based on classification: These algorithms are somewhat slower than the former because they first select regions of interest and then classify those regions using a CNN. Widely known algorithms in this group include R-CNN, Fast R-CNN and Faster R-CNN.


Why choose YOLO over other algorithms? 


YOLO is more popular than other algorithms because it achieves high accuracy while also running in real time. There are many algorithms for object detection; to name some of them:
1. Fast R-CNN
2. Faster R-CNN
3. Histogram of Oriented Gradients (HOG)
4. Region-based Convolutional Neural Networks (R-CNN)
5. Region-based Fully Convolutional Network (R-FCN)
6. Single Shot Detector (SSD)
7. Spatial Pyramid Pooling (SPP-net)


Fast R-CNN: Fast R-CNN is written in C++ and Python, and it addresses the disadvantages associated with R-CNN. Using Fast R-CNN can be advantageous because training is done in a single stage using a multi-task loss. It also removes the need for disk storage for feature caching.

Histogram of Oriented Gradients: The HOG algorithm uses techniques such as regions of interest and a sliding detection window to detect objects in image processing. One of the advantages of this algorithm is that it is simple and easy to understand.

Single Shot Detector: The Single Shot Detector (SSD) uses a single deep neural network. Its approach is to discretise the output space of bounding boxes into a set of default boxes with different aspect ratios and scales for each feature map location. Using SSD can be advantageous because it eliminates proposal generation and the subsequent pixel or feature resampling stages, and combines all computation in a single network.


YOLO is chosen among the various algorithms for real-time object detection because of the following challenges:
1. Unknown number of objects:
The number of objects in an image is not known in advance, so locating and classifying all of them can be a difficult task.
2. Object classification and localization:
Both add to the difficulty of real-time object detection, since there is a need not only to identify the object but also to determine where it is, that is, its position in the image.
3. The need for speed in real-time object detection:
The algorithms used in real-time object detection need not only to be accurate but also to be fast enough to keep up with the incoming video.
Figure 1. Various classes for identifying and representing objects in YOLO
Figure 1 portrays the various classes that YOLO uses to identify and represent objects.


5. OpenCV


OpenCV stands for Open Source Computer Vision. The OpenCV library holds an ocean of functions for computer vision and contains more than 2000 algorithms. These algorithms can be used to detect and recognize objects such as humans, their faces, hand gestures and movements. They can also be used to track camera movements, produce 3D models of objects, and find similar images in a given dataset. OpenCV makes it possible to read and write images, capture videos, process digital images, perform feature detection, and detect objects such as faces, eyes and material objects.


OpenCV Social Distancing Detector Steps 






Input image/frame
Step #1: Object detection (filtering only the "People" class)
Step #2: Compute pairwise distances between centroids
Step #3: Check distance matrix for people < N pixels apart
Show results





Figure 2. Block diagram of OpenCV social distancing detector 
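The three steps in Figure 2 can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' implementation: the detection tuple layout `(class_name, confidence, (x, y, w, h))`, the function name `find_violations` and the 50-pixel threshold are assumptions made for the example, and the detections themselves would normally come from a detector such as YOLO.

```python
from itertools import combinations
from math import dist

# Assumed layout of one detection: (class_name, confidence, (x, y, w, h)),
# where (x, y) is the top-left corner of the bounding box in pixels.


def find_violations(detections, min_distance_px):
    """Return (centroids, violating_indices) for the 'person' detections."""
    # Step 1: keep only the person class and reduce each box to its centroid.
    centroids = [
        (x + w / 2, y + h / 2)
        for name, _conf, (x, y, w, h) in detections
        if name == "person"
    ]
    # Steps 2 and 3: compare every pair of centroids against the threshold.
    violating = set()
    for i, j in combinations(range(len(centroids)), 2):
        if dist(centroids[i], centroids[j]) < min_distance_px:
            violating.update((i, j))
    return centroids, violating


detections = [
    ("person", 0.90, (0, 0, 20, 40)),
    ("person", 0.80, (30, 0, 20, 40)),
    ("dog", 0.70, (35, 5, 10, 10)),
    ("person", 0.95, (300, 0, 20, 40)),
]
centroids, violating = find_violations(detections, min_distance_px=50)
print(sorted(violating))  # [0, 1]
```

For large crowds, the pairwise loop could be replaced by a vectorised distance matrix, for example with `scipy.spatial.distance.cdist`.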


6. Approach for Social Distance Detection 


1. Human movement is detected with the help of YOLO (You Only Look Once). The movements are then tracked in the frame.

2. The Euclidean distance is calculated between each pair of individuals, and the appropriate points of interest are maintained by dismissing the ones no longer in use and associating new ones with the new distance.

3. The boxes then show whether the individuals are at risk or are safe.

4. This project combines three parts of artificial intelligence: object detection, object tracking, and calculation of the distance between the identified objects.

5. The main class used in this project is the person class. The YOLO family can recognise a large number of classes (YOLO9000 covers more than 9000), of which only this one is required in this project.


7. Detection

Next comes the detection of human movement within the frame for real-time object detection. To detect the movement of pedestrians, a bounding box is required for every pedestrian so that their movement can be tracked from one point to another. For detection, an ID is assigned to each person: every time a new person appears in the frame, a new ID is assigned to that object. If an object does not change position for 50 consecutive frames, its ID is re-registered. Every time an object moves from one position to another, its ID is re-registered at the new position.
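A simple way to realise this kind of ID bookkeeping is a centroid tracker. The sketch below is an illustration under stated assumptions, not the authors' code: it interprets "re-registering" as updating the stored position of an existing ID, it drops an ID that has had no matching detection for more than `max_missing` consecutive frames (50 by default, borrowing the figure from the text), and `max_match_dist` is a hypothetical matching radius in pixels.

```python
from math import dist


class CentroidTracker:
    """Toy tracker: nearest-centroid matching with a missing-frame limit."""

    def __init__(self, max_missing=50, max_match_dist=75.0):
        self.next_id = 0
        self.objects = {}   # object ID -> last known centroid
        self.missing = {}   # object ID -> consecutive frames without a match
        self.max_missing = max_missing
        self.max_match_dist = max_match_dist

    def update(self, centroids):
        unmatched = set(self.objects)
        for c in centroids:
            # Greedily match each detection to the nearest unmatched known object.
            best = min(unmatched, key=lambda i: dist(self.objects[i], c), default=None)
            if best is not None and dist(self.objects[best], c) <= self.max_match_dist:
                self.objects[best] = c          # existing ID moves to the new position
                self.missing[best] = 0
                unmatched.discard(best)
            else:
                self.objects[self.next_id] = c  # new person: assign a new ID
                self.missing[self.next_id] = 0
                self.next_id += 1
        # IDs with no matching detection age out after max_missing frames.
        for i in unmatched:
            self.missing[i] += 1
            if self.missing[i] > self.max_missing:
                del self.objects[i], self.missing[i]
        return dict(self.objects)


tracker = CentroidTracker()
print(tracker.update([(10, 10), (200, 200)]))  # {0: (10, 10), 1: (200, 200)}
print(tracker.update([(15, 12), (205, 198)]))  # {0: (15, 12), 1: (205, 198)}
```

Greedy matching depends on the order of the detections; a production tracker would typically solve the assignment jointly, but the greedy version keeps the ID logic easy to follow.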





Figure 3. Detection of objects with the dots representing the centroids 


The image above portrays the detection of objects with the dots representing the centroids. 





Figure 4. Use of ID and bounding box around objects 


The image above portrays the use of ID and bounding box around objects for the identification and recognition of the object. 


8. Distance Calculation

After all the movement has been bounded within boxes, the distance needs to be calculated in order to decide whether or not an individual is practising social distancing. Two properties of the object used as a reference for measurement are important:
1. The dimensions of the object must be known in a measurable unit, such as centimeters or millimeters.
2. The object should be easily identifiable in the image, either by its appearance or by its location.
Several Python packages need to be installed in order to measure the distance between the objects. Some of the necessary packages are:

1. scipy.spatial (distance)
2. imutils (perspective)
3. imutils (contours)
4. cv2
5. argparse
6. numpy
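The two properties listed above are what is needed to convert pixel measurements into real units. A minimal sketch of that conversion, assuming a reference object of known width is visible in the frame and lies at roughly the same depth as the people being measured; the function names and the numbers are illustrative, not taken from the paper.

```python
from math import dist


def pixels_per_unit(ref_width_px, ref_width_real):
    """Scale factor derived from a reference object of known real-world width."""
    if ref_width_real <= 0:
        raise ValueError("reference width must be positive")
    return ref_width_px / ref_width_real


def real_distance(p1, p2, scale):
    """Euclidean pixel distance between two points, converted to real units."""
    return dist(p1, p2) / scale


# Hypothetical numbers: a 50 cm wide marker that spans 100 px in the image.
scale = pixels_per_unit(ref_width_px=100, ref_width_real=50)  # 2 px per cm
print(real_distance((0, 0), (300, 400), scale))               # 250.0
```

The conversion is only as good as the assumption that the reference object and the measured points share a similar distance from the camera.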


OpenCV was started at Intel in 1999 by Gary Bradski. Vadim Pisarevsky joined Gary Bradski to manage Intel's Russian software OpenCV team. In 2005, OpenCV was used on Stanley, the vehicle that won the 2005 DARPA Grand Challenge. Later, its active development continued under the support of Willow Garage, with Gary Bradski and Vadim Pisarevsky leading the project. OpenCV now supports a multitude of algorithms related to computer vision and machine learning and is expanding day by day. OpenCV supports a wide variety of programming languages such as C++, Python and Java, and is available on different platforms including Windows, Linux, OS X, Android and iOS. Interfaces for high-speed GPU operations based on CUDA and OpenCL are also under active development. OpenCV-Python is the Python API for OpenCV, combining the best qualities of the OpenCV C++ API and the Python language.


[Figure 5 diagram; surviving labels: "distance", "focal length"]





Figure 5. The formulation of equation 
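The equation in Figure 5 is not recoverable from the text, but its labels suggest a relation between distance and focal length. One commonly used formulation is triangle similarity, sketched here as an assumption rather than as the paper's confirmed method: a reference object of known width W placed at a known distance D and appearing P pixels wide gives an apparent focal length F = (P × D) / W, after which a distance can be estimated as D' = (W × F) / P'. The function names and numbers are illustrative.

```python
def focal_length_px(known_distance, known_width, width_px):
    """Calibration step: apparent focal length from one reference measurement."""
    return (width_px * known_distance) / known_width


def distance_to_camera(known_width, focal_px, width_px):
    """Estimate the distance of an object of known width from its pixel width."""
    return (known_width * focal_px) / width_px


# Hypothetical calibration: an object 0.5 m wide, 2.0 m away, appears 200 px wide.
f = focal_length_px(known_distance=2.0, known_width=0.5, width_px=200)
print(f)                                                              # 800.0
# The same object appearing 100 px wide is then estimated to be twice as far.
print(distance_to_camera(known_width=0.5, focal_px=f, width_px=100))  # 4.0
```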


After this, the class ID and the confidence (that is, the probability) are calculated. There are two cases: either an existing class ID is matched or a new ID is created. If an existing ID is matched, the confidence is calculated again and must be above the minimum confidence threshold. The boxes are then drawn around the objects in the image. Non-maxima suppression helps in selecting the best box among the overlapping boxes drawn around an object.
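Confidence filtering followed by non-maxima suppression can be sketched in a few lines. This is a generic greedy version written for illustration; the box layout `(x1, y1, x2, y2)` and the two thresholds (`min_conf` and `iou_thresh`, both 0.3 here) are assumptions, not values given in the paper.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, min_conf=0.3, iou_thresh=0.3):
    """Return the indices of the kept boxes, highest score first."""
    # Discard detections below the minimum confidence, then sort by score.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= min_conf),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 50, 90), (12, 12, 52, 92), (200, 20, 240, 100), (0, 0, 5, 5)]
scores = [0.9, 0.75, 0.8, 0.1]
print(non_max_suppression(boxes, scores))  # [0, 2]
```

Box 1 is suppressed because it overlaps heavily with the higher-scoring box 0, and box 3 is dropped by the confidence threshold before suppression begins.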


9. Prediction on the basis of the distance calculated


The distance library is used to calculate the Euclidean distance between people in order to determine whether the distance between them is appropriate. The measured Euclidean distance is checked against the minimum distance, and pairs that are closer than the minimum are counted as violations.
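Written out, the check for two people with centroids at pixel coordinates (x_i, y_i) and (x_j, y_j) is as follows; the symbols c_i, c_j and d_min are notation introduced here for illustration.

```latex
d(c_i, c_j) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2},
\qquad
\text{violation}(i, j) \iff d(c_i, c_j) < d_{\min}
```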







Figure 6. The number of social distancing violations 


In Figure 6, all the people in red boxes are violating social distancing. The total number of violations is also shown in the image.


12. Limitations of the Social Distancing Detector


The social distancing detector has a few limitations, listed below:
1. The first thing that can be improved is camera calibration, which makes it easier to map distances in pixels to actual units of measurement.
2. The second thing that can be improved is the camera angle. A better approach would be to use a bird's eye view, which is also known as the top-down transformation.
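A top-down transformation is usually expressed as a 3x3 homography applied to image points before distances are measured. The sketch below only shows how such a matrix is applied; the matrix `H` is a hypothetical example, and in practice it would be estimated during camera calibration, for instance from four known ground-plane points (OpenCV offers `cv2.getPerspectiveTransform` for that step).

```python
def apply_homography(H, point):
    """Map an image point (x, y) to the top-down plane using a 3x3 matrix H."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under H")
    return (u / w, v / w)


# Hypothetical matrix for illustration only; a real H comes from calibration.
H = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.001, 1.0],
]
print(apply_homography(H, (100, 0)))  # (100.0, 0.0)
```

Measuring Euclidean distances between the transformed points, rather than between the raw centroids, reduces the perspective error that makes people far from the camera appear closer together.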